Dereverberation of autoregressive envelopes for far-field speech recognition
Abstract
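Speech recognition in far-field environments is adversely affected by reverberation artifacts, which manifest as temporal smearing of the sub-band envelopes. In this paper, we develop a neural model for speech dereverberation using the long-term sub-band envelopes of speech. The sub-band envelopes are derived using frequency domain linear prediction (FDLP), which performs an autoregressive estimation of the Hilbert envelopes. The neural dereverberation model estimates an envelope gain which, when applied to the reverberant signals, suppresses the late reflection components in the far-field signal. The dereverberated envelopes are used for feature extraction in speech recognition. Further, the sequence of steps involved in envelope dereverberation, feature extraction, and acoustic modeling for ASR can be implemented as a single neural processing pipeline, which allows joint learning of the dereverberation network and the acoustic model. Several experiments are performed on the REVERB challenge dataset, the CHiME-3 dataset, and the VOiCES dataset. In these experiments, joint learning of envelope dereverberation and the acoustic model yields significant performance improvements over the baseline ASR system based on the log-mel spectrogram, as well as over other past approaches to dereverberation (average relative improvements of 10–24% over the baseline system). A detailed analysis of the choice of hyper-parameters and of the cost function involved in envelope dereverberation is also provided.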
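To make the FDLP step concrete, the following is a minimal full-band sketch, not the authors' implementation. It assumes the standard FDLP formulation: linear prediction applied to the DCT of a signal yields an all-pole model whose "spectrum", read along the time axis, approximates the squared Hilbert envelope. The function name `fdlp_envelope`, the model order of 20, and the small regularization constant are illustrative assumptions. The paper works with sub-band envelopes, which would additionally require windowing the DCT coefficients into bands before prediction.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz


def fdlp_envelope(x, order=20):
    """Autoregressive estimate of the squared temporal envelope of x.

    Hypothetical helper for illustration: full-band only, one segment.
    """
    x = np.asarray(x, dtype=float)
    n = x.size

    # Frequency-domain representation of the segment (DCT-II).
    c = dct(x, type=2, norm="ortho")

    # Autocorrelation of the DCT sequence for lags 0..order.
    r = np.array([np.dot(c[: n - k], c[k:]) for k in range(order + 1)])
    r[0] *= 1.0 + 1e-6  # light regularization (assumed value) for conditioning

    # Yule-Walker normal equations, solved with a Toeplitz (Levinson) solver.
    a = solve_toeplitz(r[:order], r[1 : order + 1])
    err = r[0] - np.dot(a, r[1 : order + 1])  # prediction error power

    # Evaluate the all-pole model at angles pi * t / n, t = 0..n-1.
    # By time-frequency duality these angles index time samples.
    poly = np.concatenate(([1.0], -a))
    response = np.fft.rfft(poly, 2 * n)[:n]
    return err / np.abs(response) ** 2


# Usage: a noise burst centered at sample 300 of a 1000-sample segment.
rng = np.random.default_rng(0)
t = np.arange(1000)
x = rng.standard_normal(1000) * np.exp(-0.5 * ((t - 300) / 20.0) ** 2)
env = fdlp_envelope(x, order=20)
```

The returned array has one value per input sample and should be largest around the burst. A higher model order tracks finer envelope detail at the cost of a less smooth estimate.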
Related resources
Adaptive Multichannel Dereverberation for Automatic Speech Recognition
Reverberation is known to degrade the performance of automatic speech recognition (ASR) systems dramatically in farfield conditions. Adopting the weighted prediction error (WPE) approach, we formulate an online dereverberation algorithm for a multi-microphone array. The key contributions of this paper are: (a) we demonstrate that dereverberation using WPE improves performance even when the acou...
Coherence-based Dereverberation for Automatic Speech Recognition
The idea of performing dereverberation using a short-time spatial coherence estimate dates back to 1977 [1], when it was proposed to essentially use the magnitude of the coherence as gain for reverberation suppression. Another heuristic method was recently proposed in [2], where a soft threshold function is used to compute a gain from the coherence magnitude, and the parameters of the threshold...
Tracking and Far-Field Speech Recognition for Multiple Simultaneous Speakers
In prior work, we developed a speaker tracking system based on an extended Kalman filter using time delays of arrival (TDOAs) as acoustic features. While this system functioned well, its utility was limited to scenarios in which a single speaker was to be tracked. In this work, we remove this restriction by generalizing the IEKF, first to a probabilistic data association filter, which incorpora...
Feature mapping using far-field microphones for distant speech recognition
Acoustic modeling based on deep architectures has recently gained remarkable success, with substantial improvement of speech recognition accuracy in several automatic speech recognition (ASR) tasks. For distant speech recognition, the multi-channel deep neural network based approaches rely on the powerful modeling capability of deep neural network (DNN) to learn suitable representation of dista...
Hilbert Envelope Based Features for Far-Field Speech Recognition
Automatic speech recognition (ASR) systems, trained on speech signals from close-talking microphones, generally fail in recognizing far-field speech. In this paper, we present a Hilbert Envelope based feature extraction technique to alleviate the artifacts introduced by room reverberations. The proposed technique is based on modeling temporal envelopes of the speech signal in narrow sub-bands u...
Journal
Journal title: Computer Speech & Language
Year: 2022
ISSN: 1095-8363, 0885-2308
DOI: https://doi.org/10.1016/j.csl.2021.101277